Introduction

IGF-I  is  a  central  hormone  in  the  regulation  of  anabolic  (growth)  processes  as  a 
function  of  available  energy  and  elementary  substrates  (e.g.,  essential  amino  acids), 
and  has  strongly  mitogenic  and  anti-apoptotic  activities.  Results  from  in  vitro  studies 
and  animal  experiments  show  that,  in  excess,  the  anabolic  signals  by  IGF-I  can 
promote  the  development  of  tumors  at  various  organ  sites,  and  recent  epidemiological 
studies  have  shown  an  increased  breast  cancer  risk  in  women  with  elevated  serum 
IGF-I, or with elevated levels of IGF-I for given levels of IGFBP-3, the major
IGF-binding protein in plasma.

While nutritional status is one important determinant of circulating IGF-I levels
(Kaaks & Lukanova, 2001), heritability studies have shown that, in well-nourished
populations, a large part (40-60%) of the variation in IGF-I is (co)determined by genetic
factors (Hong et al., 1996; Harrela et al., 1996; Verhaeghe et al., 1996). To improve
understanding of the major determinants of IGF-I levels, as well as of cancer
risk, we are conducting a study with the following objectives:

1. confirm that elevated prediagnostic serum levels of IGF-I increase breast cancer
risk,  especially  in  premenopausal  women; 

2.  describe  exhaustively  existing  polymorphisms,  allele  frequencies  and  haplotypes 
in  15  selected  genes  related  to  the  secretion  of  growth  hormone,  and  hence  to  the 
synthesis  of  IGF-I  and  IGFBP-3;  and 

3.  examine  whether  these  genetic  polymorphisms  are  related  to  significant  increases 
or  decreases  in  circulating  levels  of  IGF-I  and  IGFBP-3,  as  well  as  in  breast 
cancer  risk. 

Our  project  is  a  large  case-control  study  nested  within  the  European  Prospective 
Investigation  into  Cancer  and  Nutrition  (EPIC),  using  prediagnostic  blood  (serum  and 
DNA)  samples  collected  during  1992-1998,  from  233,800  women  in  western  Europe. 

As  mentioned  in  our  original  application,  the  study  was  planned  in  four  parts: 

1. A case-control study (about 1000 cases and 1000 controls) nested in a
prospective cohort, to estimate the associations of serum IGF-I and IGFBP-3
levels with breast cancer risk;

2. Preparation of an exhaustive catalog of polymorphisms and haplotypes in the
15 selected candidate genes, and a (“phase-1”) association study on a subset of
400 controls to identify genotypes that have a minimum level of association
with serum levels of IGF-I and IGFBP-3;

3. A nested case-control study, to estimate relative risks of breast cancer in
relation only to those genotypes selected in phase 1;

4. A (“phase-2”) study of associations of these selected genotypes with IGF-I and
IGFBP-3, in all cases and controls.
For year 1, our workplan (as in the “Statement of Work” section of our application)
was as follows:

1. Selection of cases and controls, using the established eligibility and matching
criteria,  and  extraction  of  case-control  data  sets  with  relevant  information  from 
questionnaires  and  anthropometry:  Task  1,  months  1-2. 

2.  Retrieval  of  serum  and  buffy  coat  samples  from  the  central  EPIC  storage  facility; 
assembly  of  the  serum  samples  into  batches  of  matched  case-control  sets  for 
immunoassays;  assembly  of  the  buffy-coat  samples  into  batches  for  DNA 
extraction:  Task  2,  months  2-4. 

3. Assays of IGF-I and IGFBP-3 in serum of breast cancer cases (n = 1000) and
controls (n = 1000): Task 3, months 7-12.

4.  DNA  extraction  for  all  2000  cases  and  controls:  Task  5,  months  1-12. 

5. Preparation of an exhaustive catalog of polymorphisms by searching the literature,
and by DHPLC analysis of DNA from a subset of 200 subjects: Task 6, months
1-12.


These goals were met almost entirely:

■  Task  1:  In  October  2001,  there  were  a  total  of  1852  cases  of  breast  cancer 
diagnosed  after  donation  of  a  blood  sample.  Of  these,  1180  had  complete 
information  about  exogenous  hormone  use  and  menopausal  status  at  the  time  of 
blood  donation.  Among  the  1180  cases  with  complete  information,  we  identified 
810  cases  with  breast  cancer  who  did  not  use  any  hormones  at  the  time  of  blood 
sampling,  and  for  each  of  these  810  women  we  selected  two  control  subjects 
among  all  women  free  of  cancer  until  the  date  of  diagnosis  of  the  index  case,  not 
using  exogenous  hormones  at  the  time  of  blood  donation.  The  control  subjects 
were  matched  to  the  cases  on  study  (recruitment)  center,  and  age  and  date  of 
blood  collection,  menopausal  status,  day  of  the  menstrual  cycle  (in  premenopausal 
women),  and  fasting  status  (time  since  last  consumption  of  food  or  drinks). 

For  430  cases  (women  recruited  in  the  study  centres  of  Naples,  Turin,  Oviedo,  and 
a  number  of  cities  in  France),  the  information  on  exogenous  hormone  use  was  not 
yet  complete  in  our  central  database  at  IARC  (some  data  from  a  blood  collection 
form still had to be transferred). From among these women, an additional 250
cases are expected to be added to the study by the end of 2002, and about
500 control subjects will be matched to them.

■  Task  2:  For  all  810  breast  cancer  cases  selected  for  the  study,  and  for  1610 
matched  control  subjects,  serum  and  buffy  coat  samples  were  retrieved  from  30 
different  liquid  nitrogen  containers,  and  regrouped.  Samples  were  put  together  in 
batches  for  hormone  (IGF-I  and  IGFBP-3)  assays,  and  buffy  coats  were  set  aside 
for  DNA  extraction. 

■ Task 3: IGF-I and IGFBP-3 were measured for 803 cases and for 1560 controls.
Further  assays  are  planned,  before  the  end  of  2003,  for  some  250  cases  still  to  be 
added  to  the  study  (as  described  above),  plus  their  matched  controls. 

■ Task 5: DNA has been extracted from the buffy coat samples of 1079 controls
and 600 cases (August 2002), and the extraction is currently (September 2002)
being completed for the full project of around 1050 cases and 2100 controls.
Extracted DNA samples (high concentration) have been put into deep-well
microplates; from these, a series of secondary plates, at a lower and standardized
DNA concentration, are being prepared. The secondary plates will be used for the
preparation of PCR plates for genotyping (years 2 and 3 of the project).

■  Task  6:  As  a  first  step,  we  made  an  exhaustive  catalog  of  polymorphisms  (coding 
and  non-coding  regions),  using  a  search  of  literature  (MEDLINE)  and  publicly 
available  databases,  as  well  as  experimental  discovery  using  denaturing  HPLC 
(DHPLC),  a  technology  available  in  our  laboratory  (Dr  Canzian,  Genome 
Analysis  Group,  IARC).  DNA  samples  of  192  healthy  control  subjects  were  used 
for  the  discovery  of  polymorphisms  by  DHPLC,  including  137  Caucasians  (from 
Sweden,  Estonia,  Germany,  Romania,  Spain,  Basque  country  and  Greece),  43 
Africans  and  12  Japanese.  SNP  searches  through  literature,  databases  and  DHPLC 
led  us  to  complete  a  catalogue  of  127  polymorphisms. 

After  the  SNP  discovery  step,  we  selected  polymorphisms  of  interest  for  the  study, 
on  the  basis  of  allelic  frequencies  and/or  knowledge  about  a  possible  functional 
role.  A  DNA  microarray  (“IGF  chip”)  was  designed,  for  a  total  of  78  SNPs  in  our 
list  of  candidate  genes  (Table  1). 


Table 1.

Gene      Number of SNPs   Number of SNPs         Number of SNPs
          on chip          that passed Q.C. a)    suitable for haplotyping b)

GH1             9                  5                      2
GHR             8                  7                      3
GHRL            4                  3                      3
GHRH            2                  2                      2
GHRHR          10                 10                      9
IGF1            7                  6                      4
IGF1R           2                  2                      2
IGFBP1          4                  4                      4
IGFBP3          8                  6                      5
POU1F1          3                  2                      1
SST             1                  1                      1
SSTR1           1                  1                      1
SSTR3           8                  8                      6
SSTR4           5                  4                      4
SSTR5           5                  5                      4
IGFALS          1                  0                      0

Total          78                 66                     51

a) SNPs that could be typed reliably with the chip technology. The criteria for
quality control were that the three genotype classes (homozygote for the
common allele, heterozygote, homozygote for the rare allele) should be
observed and that they should be in Hardy-Weinberg equilibrium.

b) SNPs were considered suitable for haplotyping if they passed quality control
and had a frequency of the rare allele >5% in this population.
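
The two criteria in footnotes a) and b) can be illustrated with a minimal sketch. It assumes a 1-df chi-square goodness-of-fit test for Hardy-Weinberg equilibrium and a significance threshold of 0.05; the report states neither the test nor the threshold, and the function names are hypothetical.

```python
from math import erfc, sqrt


def hwe_chi2(n_common_hom, n_het, n_rare_hom):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Arguments are numbers of subjects per genotype class.
    Returns (chi2, p_value).
    """
    n = n_common_hom + n_het + n_rare_hom
    p = (2 * n_common_hom + n_het) / (2 * n)  # frequency of the common allele
    q = 1.0 - p                               # frequency of the rare allele
    observed = (n_common_hom, n_het, n_rare_hom)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2.0))


def rare_allele_frequency(n_common_hom, n_het, n_rare_hom):
    n = n_common_hom + n_het + n_rare_hom
    return (2 * n_rare_hom + n_het) / (2 * n)


def passes_qc(counts, alpha=0.05):
    """Footnote a): all three genotype classes observed, no HWE departure.

    alpha is an assumed threshold, not a value taken from the report.
    """
    if min(counts) == 0:
        return False
    _, p_value = hwe_chi2(*counts)
    return p_value >= alpha


def suitable_for_haplotyping(counts, min_freq=0.05):
    """Footnote b): passed quality control and rare-allele frequency > 5%."""
    return passes_qc(counts) and rare_allele_frequency(*counts) > min_freq
```

For example, genotype counts of (360, 480, 160) match Hardy-Weinberg proportions exactly and give a rare-allele frequency of 0.40, so such a SNP would pass both filters under these assumptions.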

Workplan for year 2.

During year 1, we were already able to start some of the work originally
scheduled for year 2. The workplan for year 2, as stated in our original proposal,
includes:

6. Complete genotyping of a subset of 400 controls; statistical analysis of a phase-1
association study, relating genotypes to serum concentrations of IGF-I and
IGFBP-3: Task 7 in our original workplan, months 12-24.

After the SNP discovery step (Task 6, described above) we selected polymorphisms
of interest for the study, on the basis of allelic frequencies and/or knowledge about a
possible functional role. A DNA microarray (“IGF chip”) was designed, for a total of
78 SNPs in our list of candidate genes. This microarray was then used to type these
SNPs in a cross-sectional study population of 249 women and 228 men [men were
also typed, as their DNA was readily available in extracted form, and because allele
frequencies and genetic haplotypes are the same in men and women (none of our
candidate genes are X-linked)]. In the same cross-sectional study we also measured
IGF-I and IGFBP-3, so that we could perform a first analysis of associations of
polymorphisms with these two hormonal parameters.

Table  2  shows  results  of  this  cross-sectional  study,  which  was  used  to  describe  SNP 
allele  frequencies  and  SNP  haplotypes  in  this  sub-population  of  EPIC.  For  a  number 
of SNPs, the quality of measurement was insufficient, while for others (initially
identified from public databases without good prevalence data) we found that the
prevalence was very low (< 1%) in our study population. Furthermore, a few SNPs
were in close to perfect linkage disequilibrium (Δ > 0.90) with others. After elimination
of these various non-informative SNPs, a total of 51 SNPs remained for estimation of
major  haplotypes,  and  for  statistical  analyses  of  association  with  IGF-I  and  IGFBP-3 
levels.  Haplotypes  at  the  individual  level  were  predicted  with  custom  software 
prepared  by  us  (Cox  et  al.,  manuscript  submitted). 
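
The removal of SNPs in close to perfect linkage disequilibrium can be sketched as follows. This is an illustration only: it assumes the disequilibrium measure is the correlation coefficient between alleles at two loci, computed from counts of the four two-SNP haplotypes. The exact statistic and software used in the study are not specified here, and the function names are hypothetical.

```python
from math import sqrt


def pairwise_ld(n11, n12, n21, n22):
    """LD statistics from counts of the four two-SNP haplotypes.

    n11: allele 1 at both SNPs; n12: allele 1 at SNP A and allele 2 at SNP B;
    n21: allele 2 at SNP A and allele 1 at SNP B; n22: allele 2 at both SNPs.
    Returns (D, |D'|, r).
    """
    n = n11 + n12 + n21 + n22
    p_a = (n11 + n12) / n  # frequency of allele 1 at SNP A
    p_b = (n11 + n21) / n  # frequency of allele 1 at SNP B
    if not (0 < p_a < 1 and 0 < p_b < 1):
        raise ValueError("both SNPs must be polymorphic")
    d = n11 / n - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    r = d / sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, abs(d) / d_max, r


def is_redundant(haplotype_counts, threshold=0.90):
    """True if the two SNPs carry nearly the same information (|r| > threshold)."""
    _, _, r = pairwise_ld(*haplotype_counts)
    return abs(r) > threshold
```

With haplotype counts of (60, 0, 0, 40) the two SNPs are perfectly correlated, so one of them could be dropped without loss of information; with (25, 25, 25, 25) they are independent.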

We  performed  preliminary  statistical  analyses  of  associations  of  SNPs,  or  their 
combinations  into  haplotypes,  with  serum  IGF-I  and  IGFBP-3. 

At  the  level  of  gene  loci,  a  first  approach  to  assess  associations  of  SNPs  with  IGF-I 
and/or  IGFBP-3  levels  was  to  estimate  the  maximum  percent  of  variation  in  these 
peptides  that  could  be  explained  by  the  individuals’  combinations  of  haplotypes  on 
the  two  chromosomes  (i.e.,  multi-allelic  genotypes,  where  each  haplotype  represents  a 
specific  allele  type)  (Table  2). 
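
As a sketch of what the “maximum” percent of variation means: a full-rank model with one dummy variable per multi-allelic genotype fits the mean peptide level of each genotype class, so its R² equals the between-genotype sum of squares divided by the total sum of squares. The function name and the data in the example are hypothetical.

```python
from collections import defaultdict


def r_squared_full_rank(haplotype_pairs, levels):
    """R² of a model with one dummy variable per multi-allelic genotype.

    haplotype_pairs: one (haplotype, haplotype) tuple per subject; the order
    of the two haplotypes is ignored.
    levels: the corresponding peptide measurements (e.g. serum IGF-I).
    """
    groups = defaultdict(list)
    for pair, y in zip(haplotype_pairs, levels):
        groups[tuple(sorted(pair))].append(y)
    grand_mean = sum(levels) / len(levels)
    ss_total = sum((y - grand_mean) ** 2 for y in levels)
    ss_between = sum(
        len(ys) * (sum(ys) / len(ys) - grand_mean) ** 2 for ys in groups.values()
    )
    return ss_between / ss_total


# Illustrative (made-up) data: four subjects, two genotype classes.
pairs = [("H1", "H1"), ("H1", "H1"), ("H1", "H2"), ("H2", "H1")]
igf1 = [100.0, 120.0, 150.0, 170.0]
r2 = r_squared_full_rank(pairs, igf1)
```

Because every genotype class gets its own parameter, this R² is an upper bound on what any reduced model based on the same haplotypes can explain in the same sample; with many rare genotype classes it will be optimistic.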

In  subsequent  steps  we  refined  the  analyses,  to  examine  which  reduced  sets  of  SNPs 
could  explain  most  of  this  variation,  using  an  approach  recently  described  by  Cordell 
and  Clayton  (2001).  This  method  uses  stepwise  regression  procedures  (forward 
selection  and  backward  elimination  strategies),  to  evaluate  the  relative  importance  of 
different  SNP  variants,  alone  or  in  variable  combinations,  within  a  gene.  Main  effects, 
as  well  as  their  possible  interactions,  are  evaluated  for  dummy  variables  representing 
phased  SNP  genotypes  within  each  gene  locus,  where  the  (parental)  phase  indicates 
whether  or  not  different  SNP  alleles  occur  together  on  the  same  chromosome.  By 
including  interaction  terms  between  the  SNP  genotype  variables,  it  is  possible  to 
assess  the  effects  of  reduced  haplotypes  within  a  gene,  composed  of  minimum  sets  of 
SNPs  that  lead  to  significant  prediction  of  plasma  peptide  levels  or  disease  risk.  This 
preliminary analysis helped us select 27 SNPs (out of the total of 51) for which
associations with IGF-I and/or IGFBP-3 were most likely.
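
The role of phase in such dummy variables can be illustrated with a small sketch. This is one possible coding, not necessarily the one used by Cordell and Clayton (2001): each SNP contributes a main-effect term (number of copies of the variant allele), and each pair of SNPs contributes an interaction term counting the chromosomes that carry both variant alleles together. The function name and variable labels are hypothetical.

```python
from itertools import combinations


def phased_design_row(hap1, hap2):
    """Regression covariates for one subject from two phased haplotypes.

    hap1, hap2: tuples of 0/1 (1 = variant allele), one entry per SNP.
    Main effects count variant alleles per SNP (0, 1 or 2). The "cis"
    terms count chromosomes carrying the variant allele at both SNPs.
    """
    if len(hap1) != len(hap2):
        raise ValueError("haplotypes must cover the same SNPs")
    n_snps = len(hap1)
    row = {}
    for i in range(n_snps):
        row[f"snp{i + 1}"] = hap1[i] + hap2[i]
    for i, j in combinations(range(n_snps), 2):
        row[f"snp{i + 1}*snp{j + 1}_cis"] = hap1[i] * hap1[j] + hap2[i] * hap2[j]
    return row


# Two double heterozygotes with identical unphased genotypes:
cis = phased_design_row((1, 1), (0, 0))    # variant alleles on the same chromosome
trans = phased_design_row((1, 0), (0, 1))  # variant alleles on opposite chromosomes
```

The two subjects have the same main-effect values and differ only in the interaction term, which is the information that a stepwise procedure on phased genotypes can use and one on unphased genotypes cannot.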

The  results  from  the  more  refined  analyses  (Table  2)  provide  preliminary  evidence 
that  circulating  levels  of  IGF-I  may  be  associated  with  polymorphic  variation  in 
several  genes  in  the  GH/IGF-I  pathway,  including  IGF1,  IGFBP3,  GHR,  GHRH, 
GHRHR,  SSTR3  and  GHRL.  The  genes  SST,  SSTR1,  POU1F1  and  GHSR  were  found 
to be relatively monomorphic (only one SNP remained for analyses), and only in the
GHSR  gene  did  one  polymorphism  show  a  significant  association  with  levels  of  IGF-I 
and  IGFBP-3. 

With current numbers of observations, the associations observed were only of borderline
statistical significance, and effects of individual SNP variants on levels of IGF-I or
IGFBP-3 were always relatively small. Nevertheless, selected SNPs explained some
20 percent of the total between-subject variation in IGF-I levels. This would
correspond to about 40% of the genetically determined variation, if one assumes that
approximately 50% of the total variation is due to heritable factors, as indicated by
twin studies.
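
The arithmetic behind this figure is a simple ratio, sketched below with a hypothetical helper. The 40% and 60% values in the second part are the bounds of the heritability range quoted in the Introduction and show how the result depends on the assumed heritability.

```python
def share_of_heritable_variation(explained_by_snps, heritability):
    """Fraction of the genetically determined variation explained by the SNPs.

    Both arguments are fractions of the total between-subject variation.
    """
    if not 0 < heritability <= 1:
        raise ValueError("heritability must be in (0, 1]")
    return explained_by_snps / heritability


# 20% of total variation explained, 50% of total variation assumed heritable.
central = share_of_heritable_variation(0.20, 0.50)

# Sensitivity to the assumed heritability (40-60% range).
upper = share_of_heritable_variation(0.20, 0.40)
lower = share_of_heritable_variation(0.20, 0.60)
```

Under these assumptions the selected SNPs would account for 40% of the heritable variation, and for between about one third and one half across the 40-60% heritability range.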

A larger cross-sectional study will be needed to allow the estimation of such a
multigenic prediction score with sufficient precision. In year 2, the number of women
in  the  cross-sectional  analysis  will  be  increased,  and  preliminary  studies  will  be 
extended  according  to  the  plans  in  our  initial  proposal,  for  a  selected  series  of  15-25 
SNPs.  This  full  study,  of  all  1050  breast  cancer  cases  plus  their  matched  controls,  will 
allow  a  much  more  precise  and  statistically  powerful  analysis  of  associations  with 
both  IGF-I  levels  and  breast  cancer  risk. 
Table 2. Plasma IGF-I and IGFBP-3 in relation to polymorphic variation in candidate
genes, among 477 men and women in Northern Sweden.

Columns: gene; number of different alleles a); R² of full-rank (“maximum”) haplotype
model b); result from stepwise regression c) (number of SNP loci showing effects /
model p-value / model R²). Entries are listed in the order printed; [...] marks
entries that are not legible.

Gene         Entries

IGFBP3       [...]
GHRL         4; 9; 0.08; 0.06; 2 / p = 0.07 / R² = 0.05
GHSR         1; NA; 0.02; 0.02; 1 / p = 0.01 / R² = 0.02; [...]
IGF1R d)     2; 3; NA d); 0.01; NA d)
IGFBP1 d)    4; 9; NA d); NA d); NA d); NA

a) SNP alleles with frequency below 1% are not counted.
b) Percent of variation in plasma peptide levels explained by a full-rank model
including all haplotype combinations on the individuals’ two chromosomes (i.e.,
individuals’ multi-allelic genotypes at a gene locus).
c) Number of SNP loci within each gene that predict variation in peptide levels,
either as main effect or in interaction with other SNPs; model obtained by stepwise
procedures based on a combination of forward selection (p_in < 0.15) and backward
elimination (p_out > 0.15), using phased SNP genotypes.
d) Genes for which a direct association with levels of IGF-I or IGFBP-3 is
physiologically less plausible.
NA = “not applicable”.
Key Research Accomplishments

Key  accomplishments  in  year  1  were: 

- An almost complete selection of cases and controls within the EPIC cohorts, for the
nested case-control study on serum IGF-I, IGFBP-3 and breast cancer risk; a
smaller number of cases and controls is being added to the study, so as to reach
the full target study size

- Measurement of IGF-I and IGFBP-3 for the cases and controls

- Nearly complete DNA extraction for the cases and controls (being finalized in
September-October 2002)

- Identification of a comprehensive catalog of SNPs in the candidate genes included
in the present study

- Preparation of a DNA genotyping chip, for 78 of these polymorphisms

- A first descriptive study of SNP and haplotype frequencies for all candidate genes

- A first analysis of associations of SNPs with circulating levels of IGF-I and
IGFBP-3


Reportable  Outcomes 

A  first  report  documenting  SNPs  identified  in  our  list  of  candidate  genes  is  in 
preparation. 



Conclusions 


Our  study  has  started  without  problems,  and  is  fully  on  schedule. 

Preliminary  results  suggest  that  an  important  part  of  between-subject  variation  in 
circulating  IGF-I  levels  may  be  due  to  polymorphic  variations  in  our  list  of  candidate 
genes;  the  planned  extension  of  our  study  to  a  larger  number  of  women  (breast  cancer 
cases  and  controls)  is  needed  to  confirm  this. 
Appendix


Catalogue  of  polymorphisms  found  by  literature  searches 
and  experimentally  in  the  15  candidate  genes. 